To appear in Communications in Statistics - Theory and Methods, 38:16, 3225 - 3239, 2009. 



NUMERICAL COMPARISON OF CUSUM AND SHIRYAEV-ROBERTS PROCEDURES 
FOR DETECTING CHANGES IN DISTRIBUTIONS 

George V. Moustakides 

Department of Electrical and Computer Engineering 
University of Patras
26500 Rio, Greece






moustaki@upatras.gr 




Aleksey S. Polunchenko and Alexander G. Tartakovsky 



Department of Mathematics 
University of Southern California 
Los Angeles, CA 90089 



{polunche, tartakov}@usc.edu



Key Words: CUSUM test; Fredholm integral equation of the second kind; numerical analysis; 
quickest change-point detection; sequential analysis; Shiryaev-Roberts test.



ABSTRACT 

The CUSUM procedure is known to be optimal for detecting a change in distribution under a
minimax scenario, whereas the Shiryaev-Roberts procedure is optimal for detecting a change
that occurs at a distant time horizon. As a simpler alternative to the conventional Monte
Carlo approach, we propose a numerical method for the systematic comparison of the two
detection schemes in both settings, i.e., minimax and for detecting changes that occur in
the distant future. Our goal is accomplished by deriving a set of exact integral equations for
the performance metrics, which are then solved numerically. We present detailed numerical
results for the problem of detecting a change in the mean of a Gaussian sequence, which
show that the difference between the two procedures is significant only when detecting small
changes.



1. INTRODUCTION 



Quickest (sequential) change-point detection deals with detecting changes in distributions 
that occur at unknown points in time. The goal is to detect the change as soon as possible 
after its occurrence, while maintaining a prescribed false alarm level. A sequential
change-point detection procedure is defined as a stopping time $T$ with respect to an
observed sequence $\{X_n\}_{n\ge 1}$.

In this paper we consider the simplest version of the change-point detection problem 
where we assume that the observations are independent and identically distributed (i.i.d.) 
before the change with a common density $f$ and i.i.d. with a different density $g$ after the
change, both of which are considered known. Our goal is to provide a comparative study 
of the main competitors - the Cumulative Sum (CUSUM) procedure introduced by Page 
(1954) and the Shiryaev-Roberts procedure introduced by Shiryaev (1961) for the Brownian 
motion case and Roberts (1966) for discrete time. 

It is known that both schemes enjoy specific optimality properties under different
optimality criteria. More precisely, it follows from Moustakides (1986) that the CUSUM
procedure is (min-max) optimal with respect to Lorden's (1971) detection measure

$$\mathcal{J}_L(T) = \sup_{\nu \ge 0}\, \operatorname{ess\,sup}\, E_\nu[(T-\nu)^+ \mid X_1,\dots,X_\nu] \tag{1.1}$$

in the class $\Delta_\gamma = \{T : E_\infty[T] \ge \gamma\}$ of detection procedures for which the
average run length (ARL) to false alarm $E_\infty[T]$ is no smaller than a given number
$\gamma > 1$. Hereafter $E_\nu$ denotes the operator of expectation when the point of change is
$\nu$ ($\nu = \infty$ means that there is no change) and $y^+$ stands for the positive part of $y$.
On the other hand, it follows from Pollak and Tartakovsky (2009) that the Shiryaev-Roberts
procedure is optimal with respect to the relative integral average detection delay measure

$$\mathrm{RIADD}(T) = \frac{\sum_{k=0}^{\infty} E_k[(T-k)^+]}{E_\infty[T]} \tag{1.2}$$

again within the same class $\Delta_\gamma$. This measure is also equivalent to the stationary
average detection delay when detecting changes occurring at a distant time horizon (see
Section 2 for further details). These latter performance measures and their corresponding
properties were motivated by similar results obtained for the Shiryaev-Roberts procedure for
the continuous-time Brownian motion model; see Shiryaev (1963) and Feinberg and Shiryaev
(2006). Finally we should mention that the two tests are asymptotically optimal as
$\gamma \to \infty$ (i.e., for low false alarm rate) with respect to both performance measures
$\mathcal{J}_L$ and RIADD and for a class of observation processes that is much richer than the
simple i.i.d. case (see, e.g., Lai, 1998 and Tartakovsky and Veeravalli, 2004).

It is of major practical interest to compare the two popular tests with respect to the two
aforementioned measures, since each performance measure attempts to capture a completely
different change-point scenario. The exact analytical characterization of the two
performance measures was recently made possible by Moustakides et al. (2009) through a set of
integral equations. These equations were in turn solved numerically using very simple
techniques, yielding the final performance metrics. Due to the corresponding exact optimality
properties, it is expected that CUSUM will outperform the Shiryaev-Roberts procedure with
respect to Lorden's performance measure $\mathcal{J}_L(T)$, while the Shiryaev-Roberts procedure
will be superior with respect to the relative integral average detection delay RIADD(T). Our
goal is to quantify this difference and assess its importance.

Comparisons of the two tests have been performed in the past. Roberts (1966) considered
a change in the mean of a Gaussian sequence and the two tests were compared with respect
to their ARL to detection $E_0[T]$ using Monte Carlo simulations. CUSUM was found
to be better and this is not surprising since $E_0[T]$, in both tests, coincides with Lorden's
measure. Pollak and Siegmund (1985) performed a comprehensive asymptotic study (as
$\gamma \to \infty$, i.e., for low false alarm rate) for the problem of detecting a change in the
drift of the Brownian motion and found that CUSUM performs better for changes that occur
in the beginning (i.e., $\nu = 0$), while the Shiryaev-Roberts procedure outperforms CUSUM
with respect to the conditional average detection delay $E_\nu[T-\nu \mid T>\nu]$ when
$\nu \to \infty$. Srivastava and Wu (1993) also presented an asymptotic analysis (as
$\gamma \to \infty$) for Brownian motion but for the stationary average detection delay case.
Tartakovsky and Ivanova (1992) obtained accurate asymptotic approximations for the ARL to
false alarm and the average detection delay for processes with i.i.d. increments (in
continuous and discrete time) and performed a detailed numerical comparison of the CUSUM and
Shiryaev-Roberts procedures for an exponential model. Finally, Dragalin (1994) analyzed the
CUSUM procedure for the problem of detecting a change in the mean of the normal distribution
in terms of the ARL to false alarm $E_\infty[T]$ and the ARL to detection $E_0[T]$, using a
precise numerical technique.

Despite the previously mentioned results, a comprehensive comparison of the two tests 
for the discrete-time model in a non-asymptotic setting, i.e., for arbitrary false alarm rate,
is still missing. In the present paper we give a partial answer to this question by proposing a 
technique that can perform the desired comparison numerically, being however of sufficient 
generality to include any i.i.d. observation model. 

The paper is organized as follows. In Section 2 we provide a brief overview of results 
in change-point detection, introduce our notation and describe the CUSUM and Shiryaev- 
Roberts procedures. In Section 3 we derive integral equations for the performance metrics of 
interest and provide a simple numerical solution that allows for efficient computation of the 
operating characteristics. In Section 4 we present the results of our numerical methodology 
in the example of detecting a change in the mean of a Gaussian sequence. 

2. CHANGE-POINT DETECTION PROCEDURES 
2.1 Notation and Problem Formulation 

Let a sequence $\{X_n\}_{n\ge 1}$ of independent random variables be observed sequentially.
Initially the sequence is "in-control", i.e., all observations are coming from the same
probability density $f(x)$. At an unknown time instant $\nu \ge 0$ something happens and the
sequence runs "out of control" by abruptly changing its statistical properties so that from
$\nu + 1$ on the density is $g(x) \not\equiv f(x)$. This change has to be detected as quickly as
possible, while controlling false alarms at a given level.

Given the sequence $\{X_n\}_{n\ge 1}$, a sequential detection procedure is identified with a
stopping time $T$ adapted to the filtration $\{\mathcal{F}_n\}_{n\ge 0}$, where
$\mathcal{F}_n = \sigma(X_1,\dots,X_n)$ is the (smallest) $\sigma$-algebra generated by the
observations up to time instant $n$, with $\mathcal{F}_0$ denoting the trivial $\sigma$-algebra.
In other words, for $n \ge 0$, the set $\{T \le n\}$ belongs to the $\sigma$-algebra
$\mathcal{F}_n$. At time instant $T$ the procedure stops and declares that a change has
occurred. The design of quickest change-point detection procedures involves optimizing a
tradeoff between two types of performance metrics, one being a measure of the detection delay
and the other of the rate of false alarms. Let us denote with $P_\nu$ and $E_\nu$ the
probability and the corresponding expectation induced by a change occurring at time
$\nu \ge 0$. According to this definition $P_\infty$ ($E_\infty$) denotes the probability
(expectation) when there is no change, while $P_0$ and $E_0$ denote the corresponding
quantities when the change takes place before observations become available.

We are interested in two different mathematical setups. In the first we follow the minimax
approach proposed by Lorden (1971) and expressed through (1.1). A similar measure,
seemingly less pessimistic (for a discussion see Moustakides, 2008), was proposed in Pollak
(1985) where detection speed is expressed via the supremum average (conditional) detection
delay

$$\mathrm{SADD}(T) = \sup_{0 \le \nu < \infty} E_\nu[T-\nu \mid T>\nu]. \tag{2.1}$$

As we have mentioned in the introduction, Lorden (1971) proposed to minimize the measure
defined in (1.1) in the class $\Delta_\gamma$, i.e., subject to the constraint
$E_\infty[T] \ge \gamma$ imposed on the ARL to false alarm. Following the same principle, Pollak
(1985) suggested a similar constrained optimization problem with Lorden's measure
$\mathcal{J}_L(T)$ replaced by SADD(T). We should emphasize that in the case of the two popular
tests we have $\mathcal{J}_L(T) = \mathrm{SADD}(T) = E_0[T]$. Consequently, even though we will
refer to SADD(T) as our first performance measure, one should keep in mind that, at the same
time, we refer to Lorden's essential supremum measure as well.

The second formulation aims at minimizing the relative integral average detection delay
defined in (1.2) subject to the lower bound on the ARL to false alarm $E_\infty[T] \ge \gamma$
(i.e., the class $\Delta_\gamma$). As has been shown by Pollak and Tartakovsky (2009), this is
instrumental in detecting a change that occurs in the distant future (large $\nu$) and is
preceded by a stationary flow of false alarms. Specifically, consider a context in which it is
of utmost importance to detect a real change as quickly as possible even at the expense of
raising many false alarms (using a repeated application of the same stopping rule) before the
change occurs. This essentially means that the change-point $\nu$ is substantially larger than
the ARL to false alarm $\gamma$ which, in this case, defines the mean time between
(consecutive) false alarms. Let $T_1, T_2, \dots$ denote sequential independent repetitions of
the stopping time $T$ and let $\mathcal{T}_j = T_1 + T_2 + \cdots + T_j$ be the time of the
$j$-th alarm. Define $I_\nu = \min\{j \ge 1 : \mathcal{T}_j > \nu\}$. In other words,
$\mathcal{T}_{I_\nu}$ is the time of detection of a true change that occurs at $\nu$ after
$I_\nu - 1$ false alarms have been raised. Write

$$\mathrm{STADD}(T) = \lim_{\nu \to \infty} E_\nu[\mathcal{T}_{I_\nu} - \nu]$$

for the limiting value of the average detection delay that we will refer to as the stationary
average detection delay (STADD). It follows from Theorem 2 in Pollak and Tartakovsky
(2009) that

$$\mathrm{STADD}(T) = \frac{\sum_{k=0}^{\infty} E_k[(T-k)^+]}{E_\infty[T]} = \mathrm{RIADD}(T). \tag{2.2}$$

STADD(T) is the second performance measure we will adopt for our comparisons.

We note that the stationary average detection delay measure STADD(T) was first
introduced by Shiryaev (1961, 1963) for the problem of detecting a change in the drift of a
Brownian motion, where the Shiryaev-Roberts procedure was also introduced for the
first time and shown to be optimal with respect to STADD(T) in the class of procedures
with $E_\infty[T] = \gamma$. See also Feinberg and Shiryaev (2006).

2.2 CUSUM and Shiryaev-Roberts Procedures 
For $n \ge 1$, define

$$\Lambda_n = \frac{g(X_n)}{f(X_n)},$$

the "instantaneous" likelihood ratio between the post-change and pre-change hypotheses. To
avoid complications we shall assume that $\Lambda_1$ is continuous. Yet, if need be, the case
where $\Lambda_1$ is non-arithmetic can also be covered with a certain additional effort.

Using the previous notation, the Shiryaev-Roberts procedure stops and raises an alarm at

$$T_A^{\mathrm{SR}} = \inf\{n \ge 1 : R_n \ge A\}, \qquad \inf\{\varnothing\} = \infty,$$

where $R_n$ is the Shiryaev-Roberts detection statistic defined as

$$R_n = \sum_{k=1}^{n} \prod_{j=k}^{n} \Lambda_j \tag{2.3}$$

and $A > 0$ is a threshold chosen so that the false alarm constraint
$E_\infty[T_A^{\mathrm{SR}}] = \gamma$ is met.

It is straightforward to verify from (2.3) that the Shiryaev-Roberts statistic allows for
the following convenient recursive representation

$$R_n = (1 + R_{n-1})\,\Lambda_n, \qquad R_0 = 0.$$

Pollak and Tartakovsky (2009) showed that the Shiryaev-Roberts procedure is exactly
optimal in the sense of minimizing the relative integral average detection delay RIADD(T)
and hence due to (2.2) the stationary average detection delay STADD(T) for every $\gamma > 1$.
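The equivalence between the definition (2.3) and the recursive representation can be checked numerically. The following is a minimal sketch, not part of the original study: the function names are illustrative, and the Gaussian likelihood ratio with post-change mean 0.5 is an assumption made only to generate sample values.

```python
import math
import random

def sr_direct(lrs):
    """R_n from the definition (2.3): sum over k of the product of lrs[k], ..., lrs[n-1]."""
    return sum(math.prod(lrs[k:]) for k in range(len(lrs)))

def sr_recursive(lrs):
    """R_n from the recursion R_n = (1 + R_{n-1}) * Lambda_n with R_0 = 0."""
    r = 0.0
    for lam in lrs:
        r = (1.0 + r) * lam
    return r

def gaussian_lr(x, theta):
    """Likelihood ratio g(x)/f(x) for N(theta, 1) against N(0, 1)."""
    return math.exp(theta * x - 0.5 * theta * theta)

random.seed(0)
theta = 0.5  # illustrative post-change mean (an assumption of this sketch)
lrs = [gaussian_lr(random.gauss(0.0, 1.0), theta) for _ in range(25)]
print(sr_direct(lrs), sr_recursive(lrs))  # the two values agree up to rounding
```

The direct form costs a number of multiplications that grows quadratically with n, whereas the recursion needs one multiplication per observation, which is why the recursive form is the one used in practice.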

The CUSUM test is motivated by the maximum likelihood argument and is based on the
comparison of the maximum likelihood ratio

$$V_n = \max_{1 \le k \le n} \prod_{j=k}^{n} \Lambda_j$$

with a positive threshold $A$, i.e., the CUSUM stopping time is defined as

$$T_A^{\mathrm{CS}} = \inf\{n \ge 1 : V_n \ge A\}, \qquad \inf\{\varnothing\} = \infty. \tag{2.4}$$

It is easily verified that the statistic $V_n$ can be computed recursively as

$$V_n = \max\{1, V_{n-1}\}\,\Lambda_n, \qquad V_0 = 1. \tag{2.5}$$

Note that the conventional Page's CUSUM statistic is given by

$$W_n = \max\{0, W_{n-1} + \log \Lambda_n\}, \qquad W_0 = 0. \tag{2.6}$$

Clearly, the trajectories of this statistic coincide with the trajectories of $\log V_n$ on the
positive half plane and, therefore, the CUSUM stopping time defined in (2.4) is equivalent to
the familiar Page's stopping time

$$T_A^{\mathrm{PG}} = \inf\{n \ge 1 : W_n \ge \log A\}$$



whenever A > 1. Note also that, while not crucial for most practical purposes, the CUSUM 
procedure given by (2.4) and (2.5) is more general than the classical Page rule since it allows 
for thresholds A < 1 (the classical test with such thresholds stops in one step). 
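The relation between the two forms of CUSUM can be illustrated on a short, made-up sequence of likelihood-ratio values. This is a sketch only: the function names and the sample values are hypothetical, and `None` is used to signal that no alarm was raised within the supplied data.

```python
import math

def cusum_stop(lrs, A):
    """First n with V_n >= A, where V_n = max(1, V_{n-1}) * Lambda_n and V_0 = 1."""
    v = 1.0
    for n, lam in enumerate(lrs, start=1):
        v = max(1.0, v) * lam
        if v >= A:
            return n
    return None  # no alarm within the given observations

def page_stop(lrs, A):
    """First n with W_n >= log A, where W_n = max(0, W_{n-1} + log Lambda_n), W_0 = 0."""
    w = 0.0
    for n, lam in enumerate(lrs, start=1):
        w = max(0.0, w + math.log(lam))
        if w >= math.log(A):
            return n
    return None

lrs = [0.5, 1.5, 0.8, 2.0, 1.2, 3.0]  # made-up likelihood-ratio values
print(cusum_stop(lrs, 4.0), page_stop(lrs, 4.0))  # threshold above 1
print(cusum_stop(lrs, 0.7), page_stop(lrs, 0.7))  # threshold below 1
```

With the threshold above 1 both rules stop at the same observation. With the threshold below 1 Page's rule stops at the first observation because W_1 is never negative, while the rule based on V_n can continue.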

Threshold $A = A_\gamma$ is chosen in such a way that the ARL to false alarm meets the
constraint $E_\infty[T_A^{\mathrm{CS}}] = \gamma$ exactly. While we use the same notation $A$
for the thresholds in both the CUSUM and Shiryaev-Roberts procedures, to avoid confusion we
stress that the thresholds are in fact fairly different for achieving the same false alarm rate.

In the minimax setting, Lorden (1971) proved that CUSUM is asymptotically (as
$\gamma \to \infty$) optimal in the sense of minimizing $\mathcal{J}_L(T)$ over all stopping
times $T$ such that $E_\infty[T] \ge \gamma$. This result was later improved by Moustakides
(1986) who showed that CUSUM is exactly optimal for every $\gamma > 1$ (for a different proof,
see Ritov, 1990).

3. INTEGRAL EQUATIONS FOR PERFORMANCE METRICS AND NUMERICAL APPROXIMATIONS

This section is devoted to our analytical methodology as applied to the Shiryaev-Roberts
and CUSUM procedures. We follow the technique developed in Moustakides et al. (2009)
for the generalized Shiryaev-Roberts procedure which can be initialized from any point
$R_0 = r \in [0, A]$ and not necessarily from 0 as in the classical case we adopt here.

We recall the important observation mentioned earlier that for both CUSUM and the
Shiryaev-Roberts procedure Lorden's essential supremum measure $\mathcal{J}_L(T)$ defined in
(1.1) and Pollak's supremum measure SADD(T) defined in (2.1) are attained at $\nu = 0$, that
is,

$$\mathcal{J}_L(T_A^{\mathrm{CS}}) = \mathrm{SADD}(T_A^{\mathrm{CS}}) = E_0[T_A^{\mathrm{CS}}], \qquad \mathcal{J}_L(T_A^{\mathrm{SR}}) = \mathrm{SADD}(T_A^{\mathrm{SR}}) = E_0[T_A^{\mathrm{SR}}],$$

where $E_0[T]$ is the average detection delay when the change occurs before surveillance begins
(also known as the ARL to detection). Therefore, in order to compare these procedures in
the worst-case scenario it is sufficient to compute the ARL to detection. Since the CUSUM
procedure is optimal with respect to Lorden's measure $\mathcal{J}_L(T)$ in the class
$\Delta_\gamma$, it is expected that it will perform better than the Shiryaev-Roberts procedure.
On the other hand, since the Shiryaev-Roberts procedure is optimal with respect to the
stationary average detection delay STADD(T), it is expected that it will perform better than
the CUSUM procedure when detecting distant changes.

In order to unify the approach for both tests, consider a sequential scheme whose stopping
time is of the form

$$T_A = \inf\{n \ge 1 : S_n \ge A\}, \qquad \inf\{\varnothing\} = \infty \tag{3.1}$$

with the corresponding Markov detection statistic satisfying

$$S_n = \xi(S_{n-1})\,\Lambda_n, \qquad n = 1, 2, \dots, \tag{3.2}$$

where $S_0 = s \in [0, A]$ is a given (fixed) starting point, $A$ is a positive threshold and
$\xi(s)$ is a sufficiently smooth positive-valued (for all $s \in [0, A]$) function.

It is evident that both the CUSUM and Shiryaev-Roberts statistics are of this form.
Indeed, for CUSUM $\xi(S) = \max\{1, S\}$ and for the Shiryaev-Roberts procedure
$\xi(S) = 1 + S$. Next, we derive a set of equations for the performance metrics of the generic
detection procedure defined in (3.1) and (3.2), which we can then easily adapt to the CUSUM and
Shiryaev-Roberts procedures by selecting the appropriate form of $\xi(S)$.
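The unified form (3.1)-(3.2) translates into a single routine parameterized by the update function. The sketch below is illustrative only; `run_length`, `xi_cusum` and `xi_sr` are hypothetical names, and the likelihood-ratio values are made up.

```python
def run_length(xi, lrs, A, s=0.0):
    """Stopping time of the generic rule: S_n = xi(S_{n-1}) * Lambda_n, stop when S_n >= A.

    Returns None when the threshold is not reached within the supplied values.
    """
    for n, lam in enumerate(lrs, start=1):
        s = xi(s) * lam
        if s >= A:
            return n
    return None

def xi_cusum(s):
    """Update function for CUSUM."""
    return max(1.0, s)

def xi_sr(s):
    """Update function for the Shiryaev-Roberts procedure."""
    return 1.0 + s

lrs = [0.5, 1.5, 0.8, 2.0, 1.2, 3.0]  # made-up likelihood-ratio values
print(run_length(xi_cusum, lrs, 4.0), run_length(xi_sr, lrs, 4.0))
```

Starting the CUSUM version from s = 0 gives the same trajectory as starting from 1, because max{1, 0} = max{1, 1}. For a common threshold the Shiryaev-Roberts statistic dominates the CUSUM statistic path by path, so it never stops later.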

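For fixed $A > 0$ and $s \in [0, A]$, define $\phi_i(s) = E_i[T_A]$, where $i \in \{\infty, 0\}$.
It is apparent that $\phi_\infty(s) = E_\infty[T_A]$ is the ARL to false alarm and
$\phi_0(s) = E_0[T_A]$ is the ARL to detection. For $k \ge 0$ and $s \in [0, A]$, define
$\delta_k(s) = E_k[(T_A - k)^+]$ and let $F_i(x) = P_i(\Lambda_1 \le x)$ denote the cumulative
distribution function of the likelihood ratio $\Lambda_1$ for $i \in \{\infty, 0\}$.

Using the Markov property of the statistic $S_n$ and the argument of Moustakides et al.
(2009), we obtain

$$\phi_i(s) = 1 + \int_0^A \phi_i(x) \left[\frac{\partial}{\partial x} F_i\!\left(\frac{x}{\xi(s)}\right)\right] dx, \qquad i \in \{\infty, 0\}, \tag{3.3}$$

and

$$\delta_k(s) = \int_0^A \delta_{k-1}(x) \left[\frac{\partial}{\partial x} F_\infty\!\left(\frac{x}{\xi(s)}\right)\right] dx, \qquad k = 1, 2, \dots, \tag{3.4}$$

with the initial condition $\delta_0(s) = E_0[T_A] = \phi_0(s)$ and the latter function
satisfying (3.3). The integral equation (3.3) yields the ARL to false alarm $E_\infty[T_A]$ and
the ARL to detection $E_0[T_A]$ while (3.4) recursively computes $E_k[(T_A - k)^+]$ as
functions of the starting point $s \in [0, A]$.

In order to compute the stationary average detection delay $\mathrm{STADD}(T_A)$ defined in
(2.2),

$$\mathrm{STADD}(T_A) = \frac{\sum_{k=0}^{\infty} E_k[(T_A - k)^+]}{E_\infty[T_A]},$$

we need to evaluate the integral average detection delay
$\psi(s) = \sum_{k=0}^{\infty} E_k[(T_A - k)^+]$. According to our previous definitions we
observe that

$$\psi(s) = \sum_{k=0}^{\infty} \delta_k(s). \tag{3.5}$$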

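To find a more convenient formula for $\psi(s)$, let us introduce a linear operator associated
with the kernel $\mathcal{K}_\infty(x, y) = \frac{\partial}{\partial x} F_\infty\!\left(\frac{x}{\xi(y)}\right)$,
which transforms a given function $\zeta$ into a new function $\eta$ as follows

$$\eta(y) = (\mathcal{K}_\infty \circ \zeta)(y) = \int_0^A \zeta(x)\, \mathcal{K}_\infty(x, y)\, dx.$$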

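Notice now that $\delta_k(s)$, defined in (3.4), can be seen as the repetitive application of
this linear operator onto the function $\delta_0(s)$. In terms of this operator, equation (3.4)
can be rewritten as

$$\delta_k(s) = (\underbrace{\mathcal{K}_\infty \circ \cdots \circ \mathcal{K}_\infty}_{k \text{ times}} \circ\, \delta_0)(s) = \underbrace{\int_0^A \cdots \int_0^A}_{k \text{ times}} \delta_0(x_0)\, \mathcal{K}_\infty(x_0, x_1)\, dx_0 \cdots \mathcal{K}_\infty(x_{k-1}, s)\, dx_{k-1}$$

with the convention that $(\mathcal{K}_\infty^{0} \circ \delta_0)(s) = \delta_0(s)$.
Consequently, this operator representation of (3.4) enables one to turn (3.5) into the
following Neumann series

$$\psi(s) = \sum_{k=0}^{\infty} \delta_k(s) = \sum_{k=0}^{\infty} (\mathcal{K}_\infty^{k} \circ \delta_0)(s),$$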

which by the geometric series convergence theorem leads to the following equation

$$\psi(s) = \delta_0(s) + \int_0^A \psi(x) \left[\frac{\partial}{\partial x} F_\infty\!\left(\frac{x}{\xi(s)}\right)\right] dx. \tag{3.6}$$



The geometric series convergence theorem applies since the spectral radius of the operator
$\mathcal{K}_\infty$ is strictly less than 1. The proof of the latter fact for the
Shiryaev-Roberts procedure can be found in Moustakides et al. (2009). For the CUSUM procedure
the argument is essentially the same.
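The passage from the Neumann series to the integral equation for $\psi$ can be spelled out in operator form. The following is a sketch of that step, taking for granted that the series converges and that the operator may be applied term by term (both are consequences of the spectral radius being below 1); $\mathcal{K}_\infty^{k}$ denotes $k$ successive applications of the operator.

```latex
\begin{align*}
\psi &= \sum_{k=0}^{\infty} \mathcal{K}_\infty^{k} \circ \delta_0
      = \delta_0 + \sum_{k=1}^{\infty} \mathcal{K}_\infty \circ
        \bigl( \mathcal{K}_\infty^{k-1} \circ \delta_0 \bigr) \\
     &= \delta_0 + \mathcal{K}_\infty \circ
        \Bigl( \sum_{k=0}^{\infty} \mathcal{K}_\infty^{k} \circ \delta_0 \Bigr)
      = \delta_0 + \mathcal{K}_\infty \circ \psi .
\end{align*}
```

Written out with the kernel, the last identity reads $\psi(s) = \delta_0(s) + \int_0^A \psi(x)\, \mathcal{K}_\infty(x, s)\, dx$, which is the equation for $\psi$ stated above.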

Note that functions $\phi_i(s) = \phi_i^{\xi}(s)$ and $\psi(s) = \psi^{\xi}(s)$ depend on
$\xi$. Taking $\xi(s) = \max(1, s)$ and $\xi(s) = 1 + s$, integral equations (3.3) and (3.6)
allow for the following computation of the stationary average detection delay of the CUSUM and
Shiryaev-Roberts procedures

$$\mathrm{STADD}(T_A) = \psi(0)/\phi_\infty(0),$$

while we recall that the supremum average detection delay $\mathrm{SADD}(T_A) = \phi_0(0)$ is
computed from equation (3.3) with $\xi(s) = \max(1, s)$ for CUSUM and $\xi(s) = 1 + s$ for the
Shiryaev-Roberts procedure.

Observe that both equations (3.3) and (3.6) for $i \in \{\infty, 0\}$ are Fredholm equations of
the second kind (see, e.g., Petrovskii, 1957 and Kress, 1989). It is known that, provided 1 is
not an eigenvalue of the kernel
$\mathcal{K}_i(x, y) = \frac{\partial}{\partial x} F_i\!\left(\frac{x}{\xi(y)}\right)$, these
equations possess unique solutions. It is also worth emphasizing that throughout the paper,
kernels $\mathcal{K}_i(x, y)$ are sufficiently smooth, because the likelihood ratio was assumed
to be continuous.

In general, it is not feasible to obtain analytical solutions since the corresponding integral
equations are difficult to solve. Alternatively, we can attempt to solve these equations
numerically. Efficient numerical schemes are developed in Kantorovich and Krylov (1958),
Petrovskii (1957) and Atkinson and Han (2001). The most popular approach consists in
applying a quadrature rule to approximate the integral appearing on the right-hand side
of (3.3) and (3.6). Specifically, once the choice of a quadrature rule is made, the interval
$[0, A]$ is divided into a partition $0 = x_0 < x_1 < \dots < x_N = A$, and the functions
$\phi_i(x)$ are sampled at the breakpoints producing column vectors
$\boldsymbol{\phi}_i = [\phi_i(x_0), \phi_i(x_1), \dots, \phi_i(x_N)]'$. The integral is then
evaluated using the quadrature rule by the following simple matrix-vector multiplication

$$\int_0^A \mathcal{K}_i(x, y)\, \phi_i(y)\, dy = K_i \tilde{\boldsymbol{\phi}}_i + \varepsilon,$$

where $\varepsilon$ is the approximation error, $K_i$ is a matrix that depends on the chosen
quadrature rule and the partition $\{x_i\}$, $\{y_i\}$, and
$\tilde{\boldsymbol{\phi}}_i = [\tilde\phi_i(x_0), \tilde\phi_i(x_1), \dots, \tilde\phi_i(x_N)]'$
with $\tilde\phi_i(x)$ denoting the approximation to $\phi_i(x)$. A similar argument applies to
the equation for $\psi(x)$.

Matrices $K_i$ can be found using numerical integration. To this end, we will use the
simplest method sampling the interval $[0, A]$ equidistantly at the points $x_j = y_j = jh$,
$j = 0, \dots, N$ with $h = A/N$ and defining the $(n, m)$-element of matrices $K_i$ of size
$N$-by-$N$ as



$$(K_i)_{n,m} = F_i(\,\cdot\,) - F_i(\,\cdot\,), \qquad 1 \le n, m \le N. \tag{3.7}$$

Beyond the node points, the unknown function $\phi_i(x)$ is then evaluated as

$$\tilde\phi_i(x) = 1 + \sum_{j=0}^{N} \mathcal{K}_i(x, y_j)\, \tilde\phi_i(y_j).$$

Regardless of the specific form of the pre- and post-change densities, the dominant eigenvalue
$\lambda_{\max}$ of the matrix defined by (3.7) for $i = \infty$ is strictly less than 1 (and
positive). This follows from the following inequality

$$\lambda_{\max} \le \|K_\infty\|_\infty.$$

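Combining all previous observations yields

$$\tilde{\boldsymbol{\phi}}_i = \mathbf{1} + K_i \tilde{\boldsymbol{\phi}}_i, \qquad i \in \{\infty, 0\}, \tag{3.8}$$

$$\tilde{\boldsymbol{\psi}} = \tilde{\boldsymbol{\phi}}_0 + K_\infty \tilde{\boldsymbol{\psi}}, \tag{3.9}$$

where $\tilde{\boldsymbol{\phi}}_i = [\tilde\phi_i(0), \tilde\phi_i(h), \dots, \tilde\phi_i(A)]'$
and $\tilde{\boldsymbol{\psi}} = [\tilde\psi(0), \tilde\psi(h), \dots, \tilde\psi(A)]'$ with
$\tilde\phi_i(x)$ and $\tilde\psi(x)$ denoting the approximations to $\phi_i(x)$ and $\psi(x)$,
respectively, and $\mathbf{1} = [1, 1, \dots, 1]'$.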

Linear matrix equations (3.8) and (3.9) constitute a complete set of approximations to
their corresponding exact integral counterparts. These equations can be solved either directly
or iteratively. Direct methods are known to be more accurate, but the accuracy comes at
a price of considerable memory requirements. Iterative methods, although less memory
demanding, are less accurate. It is evident that the accuracy of the proposed numerical
method strongly depends on the number of sample points $N$: the larger it is, the finer the
partition and the more accurate the numerical approximation. Such a conclusion follows
from the analysis performed, e.g., in Kantorovich and Krylov (1958) and Atkinson and Han
(2001).
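The linear systems (3.8) and (3.9) can be assembled and solved with a few dozen lines of code. The sketch below is an illustration under stated assumptions and should not be read as the discretization used in the paper: it uses a midpoint scheme in which each matrix entry is the probability of moving from the midpoint of one cell to another cell in one step, it takes the Gaussian mean-shift model with post-change mean theta > 0, for which the likelihood ratio is exp(theta*X - theta^2/2), and it solves the systems directly by Gaussian elimination. All function names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def lr_cdf(x, theta, sign):
    """P(Lambda_1 <= x); sign = +1 under N(0,1) (pre-change), -1 under N(theta,1)."""
    if x <= 0.0:
        return 0.0
    return norm_cdf(math.log(x) / theta + sign * theta / 2.0)

def transition_row(s, xi, h, N, theta, sign):
    """Probabilities that xi(s) * Lambda_1 lands in each cell [m*h, (m+1)*h), m < N."""
    scale = xi(s)
    edges = [lr_cdf(m * h / scale, theta, sign) for m in range(N + 1)]
    return [edges[m + 1] - edges[m] for m in range(N)]

def solve(M, b):
    """Gaussian elimination with partial pivoting for the dense system M x = b."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))) / a[r][r]
    return x

def metrics(xi, A, theta, N=80):
    """Return (ARL to false alarm, SADD, STADD) for the rule started at s = 0."""
    h = A / N
    mids = [(m + 0.5) * h for m in range(N)]
    K = {sg: [transition_row(s, xi, h, N, theta, sg) for s in mids] for sg in (+1, -1)}

    def resolvent(sg, rhs):
        M = [[(1.0 if i == j else 0.0) - K[sg][i][j] for j in range(N)] for i in range(N)]
        return solve(M, rhs)

    phi_inf = resolvent(+1, [1.0] * N)   # discretized (3.8) with i = infinity
    phi_0 = resolvent(-1, [1.0] * N)     # discretized (3.8) with i = 0
    psi = resolvent(+1, phi_0)           # discretized (3.9)

    def from_zero(sg, vec):
        # One step from the starting point s = 0, which is not a node of this grid.
        return sum(p * v for p, v in zip(transition_row(0.0, xi, h, N, theta, sg), vec))

    arl = 1.0 + from_zero(+1, phi_inf)
    sadd = 1.0 + from_zero(-1, phi_0)
    stadd = (sadd + from_zero(+1, psi)) / arl
    return arl, sadd, stadd

for name, xi in (("CUSUM", lambda s: max(1.0, s)), ("Shiryaev-Roberts", lambda s: 1.0 + s)):
    arl, sadd, stadd = metrics(xi, A=10.0, theta=1.0)
    print(f"{name}: ARL={arl:.2f} SADD={sadd:.2f} STADD={stadd:.2f}")
```

The two procedures are run here with the same threshold, so their false alarm rates differ; a like-for-like comparison would first tune each threshold to a common ARL to false alarm. Refining the grid (larger N) is the natural way to check how far the printed values have converged.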

Fredholm equations for the ARL to false alarm $E_\infty[T]$ and the ARL to detection $E_0[T]$,
but only for the CUSUM procedure, have been previously considered in the literature (see,
e.g., Dragalin, 1994 and references therein). These equations rely on the classical form of
CUSUM given in (2.6) and, therefore, differ from the ones presented in (3.3). The unified
approach we propose here, in addition to the obvious advantage of being applicable to a
whole class of procedures that includes the Shiryaev-Roberts test, CUSUM and EWMA
(not treated here) as particular cases, also simplifies the computations for CUSUM. Indeed,
note that in the conventional approach usually considered in the literature (in particular by
Dragalin, 1994), the CUSUM statistic is considered as reflected from the unit barrier$^1$, which
generates a nonzero probability mass (atom) at 1. Consequently, point 1 requires special
treatment, complicating the corresponding integral equations. This drawback disappears
under the alternative form (2.5) we adopt here. As we can see, in our approach point 1 has
zero probability like any other point in the interval $[0, A]$, and therefore, Equation (3.3) is
readily applicable. This in turn produces a non-negligible simplification in the corresponding
numerics. Finally, we should mention that one of the key characteristics of our approach is
its ability to provide integral equations for a multitude of performance measures, including:
a) the ARL to false alarm and detection; b) the average detection delay for any arbitrary
change point $\nu \ge 0$; and c) other performance metrics such as RIADD and STADD.
To the best of our knowledge such pluralism of performance characteristics has never been
offered before.

Next we apply the proposed numerical methodology to the Gaussian example and
compare the performance of the two popular tests, namely the CUSUM and Shiryaev-Roberts
procedures. We note that this is the first time that such computations have been performed
for the Shiryaev-Roberts test.

4. AN EXAMPLE 

Consider a Gaussian example of detecting a change in the mean value where observations
are i.i.d. $\mathcal{N}(0, 1)$ pre-change and i.i.d. $\mathcal{N}(\theta, 1)$, $\theta \ne 0$,
post-change. Specifically

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{x^2}{2}\right\} \quad \text{and} \quad g(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{(x-\theta)^2}{2}\right\}.$$

$^1$ Here we refer to the exponentially transformed CUSUM statistic $e^{W_n}$, where $W_n$ is
given by the recursion (2.6).

Recall that we are interested in comparing the operating characteristics of the CUSUM and
Shiryaev-Roberts detection procedures expressed via the stationary average detection delay
STADD(T) on one hand and the supremum average detection delay SADD(T) on the other,
both as functions of the ARL to false alarm $E_\infty[T]$. As we mentioned before, for both
procedures SADD(T) coincides with Lorden's essential supremum measure $\mathcal{J}_L(T)$ and
with the ARL to detection $E_0[T]$. We compute the desired performance metrics for values of
the ARL to false alarm $\mathrm{ARL}(T) = E_\infty[T]$ between 1 and $10^4$ and for
characteristic values of the post-change mean $\theta \in \{0.01, 0.1, 0.5, 1.0\}$.
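For this model the distribution functions of the likelihood ratio that enter the kernels are available in closed form. The short derivation below is a sketch, assuming $\theta > 0$ and $x > 0$, with $\Phi$ denoting the standard normal distribution function and $Z$ a standard normal variable; for $\theta < 0$ the inequalities reverse.

```latex
\begin{align*}
\Lambda_1 &= \frac{g(X_1)}{f(X_1)}
           = \exp\Bigl\{\theta X_1 - \tfrac{\theta^2}{2}\Bigr\}, \\
F_\infty(x) &= P_\infty\Bigl(\theta X_1 - \tfrac{\theta^2}{2} \le \log x\Bigr)
             = P\Bigl(Z \le \tfrac{\log x}{\theta} + \tfrac{\theta}{2}\Bigr)
             = \Phi\Bigl(\tfrac{\log x}{\theta} + \tfrac{\theta}{2}\Bigr), \\
F_0(x) &= P_0\Bigl(\theta(\theta + Z) - \tfrac{\theta^2}{2} \le \log x\Bigr)
        = \Phi\Bigl(\tfrac{\log x}{\theta} - \tfrac{\theta}{2}\Bigr).
\end{align*}
```

Both functions vanish for $x \le 0$, since the likelihood ratio is strictly positive.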

Before continuing with the presentation of our numerical results, it is worth mentioning
that in order to evaluate the ARL to false alarm of the CUSUM and Shiryaev-Roberts
procedures, it is important to obtain preliminary estimates of the threshold $A$ to narrow
the domain of search for satisfying the false alarm constraint with equality. For CUSUM we
used the following first-order approximation

$$\mathrm{ARL}(T_A^{\mathrm{CS}}) \approx 2A/(\theta^2 v^2),$$

which follows from Tartakovsky (2005), where the constant $0 < v < 1$ is the subject of renewal
theory. For the Gaussian model considered this constant can be computed numerically as

$$v = \frac{2}{\theta^2} \exp\left\{-2 \sum_{k=1}^{\infty} \frac{1}{k}\, \Phi\!\left(-\frac{\theta}{2}\sqrt{k}\right)\right\},$$

where

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$$

is the standard normal distribution function. Also, for small values of $\theta$ Siegmund's
corrected Brownian motion approximations are fairly accurate (cf. Siegmund, 1985). For the
Shiryaev-Roberts procedure, we used the following approximation due to Pollak (1987):

$$\mathrm{ARL}(T_A^{\mathrm{SR}}) \approx A/v,$$

which is very accurate even for relatively small threshold values ($A > 20$).
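These approximations can be turned into starting values for the threshold search. The sketch below rests on assumptions that should be checked against the cited sources: it uses the series $v = (2/\theta^2)\exp\{-2\sum_{k \ge 1} k^{-1}\Phi(-\theta\sqrt{k}/2)\}$ for the renewal-theoretic constant of the Gaussian model, and it reads the CUSUM approximation as $\mathrm{ARL} \approx 2A/(\theta^2 v^2)$. The function names and the truncation rule are illustrative.

```python
import math

def renewal_constant(theta, z_max=8.0):
    """v = (2/theta^2) * exp(-2 * sum_k Phi(-|theta|*sqrt(k)/2) / k).

    The series is truncated once the argument of Phi falls below -z_max,
    where the remaining terms are negligible in double precision.
    """
    t = abs(theta)
    k_max = int((2.0 * z_max / t) ** 2) + 1
    total = 0.0
    for k in range(1, k_max + 1):
        z = 0.5 * t * math.sqrt(k)
        total += 0.5 * math.erfc(z / math.sqrt(2.0)) / k  # Phi(-z) / k
    return 2.0 / t ** 2 * math.exp(-2.0 * total)

def initial_thresholds(gamma, theta):
    """Rough starting thresholds for a target ARL to false alarm gamma."""
    v = renewal_constant(theta)
    return {
        "CUSUM": gamma * theta ** 2 * v ** 2 / 2.0,  # from ARL ~ 2A / (theta^2 v^2)
        "Shiryaev-Roberts": gamma * v,               # from ARL ~ A / v
    }

print(renewal_constant(1.0))
print(initial_thresholds(1000.0, 1.0))
```

The returned values are only meant to bracket the search; the threshold that meets the false alarm constraint with equality still has to be found by solving the integral equation for the ARL to false alarm, for instance by bisection on $A$.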

The figures and tables below show the operating characteristics for the aforementioned
set of parameters. As expected, the CUSUM procedure outperforms the Shiryaev-Roberts
procedure in the minimax scenario. The Shiryaev-Roberts procedure, on the other hand,
performs better with respect to the stationary average detection delay for detecting distant
changes using a repeated application of the same stopping rule. As we can see, the difference
is significant only for small changes, visible for moderate changes, while the two procedures
perform equally well for large changes.

The precision of our numerical approximations was verified by using Monte Carlo
techniques for several parameter values. In all cases, the difference was negligible. We also
note that for the Gaussian example considered in this section, Dragalin (1994) proposed a
different, more accurate but also computationally more demanding method for computing
the ARL to false alarm $E_\infty[T_A^{\mathrm{CS}}]$ and the ARL to detection
$E_0[T_A^{\mathrm{CS}}]$ of the CUSUM procedure. Comparing our results with the outcome of this
more complex approach shows that the difference is very small. This fact is an additional
indication that our simple numerical method is of sufficiently high accuracy.

Figure 1: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta = 0.01$.

Figure 2: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta$ = 0.

Table 1: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta = 0.01$

Table 2: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta = 0.1$

Table 3: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta = 0.5$

Table 4: Operating characteristics of CUSUM and Shiryaev-Roberts procedures for $\theta = 1.0$